!pip install tensorflow
Successfully installed absl-py-2.1.0 astunparse-1.6.3 flatbuffers-24.3.25 gast-0.5.4 google-pasta-0.2.0 grpcio-1.64.0 h5py-3.11.0 keras-3.3.3 libclang-18.1.1 markdown-3.6 markdown-it-py-3.0.0 mdurl-0.1.2 ml-dtypes-0.3.2 namex-0.0.8 opt-einsum-3.3.0 optree-0.11.0 protobuf-4.25.3 rich-13.7.1 tensorboard-2.16.2 tensorboard-data-server-0.7.2 tensorflow-2.16.1 tensorflow-intel-2.16.1 tensorflow-io-gcs-filesystem-0.31.0 termcolor-2.4.0 werkzeug-3.0.3 wheel-0.43.0 wrapt-1.16.0
!pip install tqdm
Successfully installed tqdm-4.66.4
import cv2 as cv
from PIL import Image
import numpy as np
import pandas as pd
import tensorflow as tf
print(tf.__version__)
import os
import shutil
import matplotlib.pyplot as plt
import seaborn as sns
import json
import tqdm
from IPython.display import clear_output
from collections import Counter
%matplotlib inline
2.16.1
Dataset details
For this detection and recognition project, we will work with a dataset organised into several folders:
- A folder of different cities: sorted by city, it contains images taken from a car, showing the road.
- An outputs folder: it contains groups of images organised as follows: (image, colour mask showing every recognised class, greyscale mask that merges certain classes together, which reduces the number of detected classes).
For practicality, and to make optimal use of the models and resources at our disposal, we decided to select only a few cities for training; we will gradually add more data to improve the results.
The image names follow these conventions:
- leftImg8bit = original images (used as input X).
- gtFine_labelIds = greyscale annotated images = masks where each pixel is mapped to its class ID (used as y). These are the masks that will be paired with the images to train the model.
- gtFine_color = colour annotated images = annotation masks where each class is represented by a colour (used for visualisation purposes, not as input).
- gtFine_instanceIds = the instances of each category, i.e. the unique identifiers assigned to each instance.
- gtFine_polygons = polygons stored in a JSON file (the original annotations, before conversion to raster format).
Output: a segmentation mask in which each pixel is assigned to its category. Same dimensions as the input image, with 8 prediction masks for the raw predictions and 1 for the argmax predictions (1 pixel = the class with the highest probability).
In this project we chose to encode the masks with label IDs (each pixel = one class), rather than one-hot encoding where each pixel would be represented by a vector of size 8.
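As an illustration of the relation between the two encodings, a minimal sketch on a hypothetical 2x2 mask:
import numpy as np
# a hypothetical 2x2 mask encoded with label IDs (1 pixel = 1 class among 8)
mask_ids = np.array([[0, 7],
                     [1, 5]], dtype=np.uint8)
# the equivalent one-hot encoding: each pixel becomes a vector of size 8
mask_onehot = np.eye(8, dtype=np.uint8)[mask_ids]   # shape (2, 2, 8)
# argmax over the last axis recovers the label-ID encoding
assert np.array_equal(mask_onehot.argmax(axis=-1), mask_ids)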
The dataset can be browsed on the Cityscapes GitHub repo: https://github.com/mcordts/cityscapesScripts
This repo provides the label schema (around thirty labels) and the 8 super-categories on which we will base the model outputs.
from collections import namedtuple
# a label and all meta information
Label = namedtuple( 'Label' , [
'name' , # The identifier of this label, e.g. 'car', 'person', ... .
# We use them to uniquely name a class
'id' , # An integer ID that is associated with this label.
# The IDs are used to represent the label in ground truth images
# An ID of -1 means that this label does not have an ID and thus
# is ignored when creating ground truth images (e.g. license plate).
# Do not modify these IDs, since exactly these IDs are expected by the
# evaluation server.
'trainId' , # Feel free to modify these IDs as suitable for your method. Then create
# ground truth images with train IDs, using the tools provided in the
# 'preparation' folder. However, make sure to validate or submit results
# to our evaluation server using the regular IDs above!
# For trainIds, multiple labels might have the same ID. Then, these labels
# are mapped to the same class in the ground truth images. For the inverse
# mapping, we use the label that is defined first in the list below.
# For example, mapping all void-type classes to the same ID in training,
# might make sense for some approaches.
# Max value is 255!
'category' , # The name of the category that this label belongs to
'categoryId' , # The ID of this category. Used to create ground truth images
# on category level.
'hasInstances', # Whether this label distinguishes between single instances or not
'ignoreInEval', # Whether pixels having this class as ground truth label are ignored
# during evaluations or not
'color' , # The color of this label
] )
Label.category
_tuplegetter(3, 'Alias for field number 3')
In this section we give the characteristics of each of the 34 original classes: name, ID, train ID, category ID and, most importantly, the colour code that represents the class on the colour masks.
The colour images show all 34 labels. In this project we will focus on 8 classes only.
labels = [
# name id trainId category catId hasInstances ignoreInEval color
Label( 'unlabeled' , 0 , 255 , 'void' , 0 , False , True , ( 0, 0, 0) ),
Label( 'ego vehicle' , 1 , 255 , 'void' , 0 , False , True , ( 0, 0, 0) ),
Label( 'rectification border' , 2 , 255 , 'void' , 0 , False , True , ( 0, 0, 0) ),
Label( 'out of roi' , 3 , 255 , 'void' , 0 , False , True , ( 0, 0, 0) ),
Label( 'static' , 4 , 255 , 'void' , 0 , False , True , ( 0, 0, 0) ),
Label( 'dynamic' , 5 , 255 , 'void' , 0 , False , True , (111, 74, 0) ),
Label( 'ground' , 6 , 255 , 'void' , 0 , False , True , ( 81, 0, 81) ),
Label( 'road' , 7 , 0 , 'flat' , 1 , False , False , (128, 64,128) ),
Label( 'sidewalk' , 8 , 1 , 'flat' , 1 , False , False , (244, 35,232) ),
Label( 'parking' , 9 , 255 , 'flat' , 1 , False , True , (250,170,160) ),
Label( 'rail track' , 10 , 255 , 'flat' , 1 , False , True , (230,150,140) ),
Label( 'building' , 11 , 2 , 'construction' , 2 , False , False , ( 70, 70, 70) ),
Label( 'wall' , 12 , 3 , 'construction' , 2 , False , False , (102,102,156) ),
Label( 'fence' , 13 , 4 , 'construction' , 2 , False , False , (190,153,153) ),
Label( 'guard rail' , 14 , 255 , 'construction' , 2 , False , True , (180,165,180) ),
Label( 'bridge' , 15 , 255 , 'construction' , 2 , False , True , (150,100,100) ),
Label( 'tunnel' , 16 , 255 , 'construction' , 2 , False , True , (150,120, 90) ),
Label( 'pole' , 17 , 5 , 'object' , 3 , False , False , (153,153,153) ),
Label( 'polegroup' , 18 , 255 , 'object' , 3 , False , True , (153,153,153) ),
Label( 'traffic light' , 19 , 6 , 'object' , 3 , False , False , (250,170, 30) ),
Label( 'traffic sign' , 20 , 7 , 'object' , 3 , False , False , (220,220, 0) ),
Label( 'vegetation' , 21 , 8 , 'nature' , 4 , False , False , (107,142, 35) ),
Label( 'terrain' , 22 , 9 , 'nature' , 4 , False , False , (152,251,152) ),
Label( 'sky' , 23 , 10 , 'sky' , 5 , False , False , ( 70,130,180) ),
Label( 'person' , 24 , 11 , 'human' , 6 , True , False , (220, 20, 60) ),
Label( 'rider' , 25 , 12 , 'human' , 6 , True , False , (255, 0, 0) ),
Label( 'car' , 26 , 13 , 'vehicle' , 7 , True , False , ( 0, 0,142) ),
Label( 'truck' , 27 , 14 , 'vehicle' , 7 , True , False , ( 0, 0, 70) ),
Label( 'bus' , 28 , 15 , 'vehicle' , 7 , True , False , ( 0, 60,100) ),
Label( 'caravan' , 29 , 255 , 'vehicle' , 7 , True , True , ( 0, 0, 90) ),
Label( 'trailer' , 30 , 255 , 'vehicle' , 7 , True , True , ( 0, 0,110) ),
Label( 'train' , 31 , 16 , 'vehicle' , 7 , True , False , ( 0, 80,100) ),
Label( 'motorcycle' , 32 , 17 , 'vehicle' , 7 , True , False , ( 0, 0,230) ),
Label( 'bicycle' , 33 , 18 , 'vehicle' , 7 , True , False , (119, 11, 32) ),
Label( 'license plate' , -1 , -1 , 'vehicle' , 7 , False , True , ( 0, 0,142) ),
]
We will now map the classes above onto the 8 classes we want.
# name to label object
name2label = { label.name : label for label in labels }
# id to label object
id2label = { label.id : label for label in labels }
# categoryId to category name
id2category = { label[4] : label.category for label in labels }
# trainId to label object
trainId2label = { label.trainId : label for label in reversed(labels) }
# category to list of label objects
category2labels = {}
for label in labels:
    category = label.category
    if category in category2labels:
        category2labels[category].append(label)
    else:
        category2labels[category] = [label]
trainId2label
{-1: Label(name='license plate', id=-1, trainId=-1, category='vehicle', categoryId=7, hasInstances=False, ignoreInEval=True, color=(0, 0, 142)),
18: Label(name='bicycle', id=33, trainId=18, category='vehicle', categoryId=7, hasInstances=True, ignoreInEval=False, color=(119, 11, 32)),
17: Label(name='motorcycle', id=32, trainId=17, category='vehicle', categoryId=7, hasInstances=True, ignoreInEval=False, color=(0, 0, 230)),
16: Label(name='train', id=31, trainId=16, category='vehicle', categoryId=7, hasInstances=True, ignoreInEval=False, color=(0, 80, 100)),
255: Label(name='unlabeled', id=0, trainId=255, category='void', categoryId=0, hasInstances=False, ignoreInEval=True, color=(0, 0, 0)),
15: Label(name='bus', id=28, trainId=15, category='vehicle', categoryId=7, hasInstances=True, ignoreInEval=False, color=(0, 60, 100)),
14: Label(name='truck', id=27, trainId=14, category='vehicle', categoryId=7, hasInstances=True, ignoreInEval=False, color=(0, 0, 70)),
13: Label(name='car', id=26, trainId=13, category='vehicle', categoryId=7, hasInstances=True, ignoreInEval=False, color=(0, 0, 142)),
12: Label(name='rider', id=25, trainId=12, category='human', categoryId=6, hasInstances=True, ignoreInEval=False, color=(255, 0, 0)),
11: Label(name='person', id=24, trainId=11, category='human', categoryId=6, hasInstances=True, ignoreInEval=False, color=(220, 20, 60)),
10: Label(name='sky', id=23, trainId=10, category='sky', categoryId=5, hasInstances=False, ignoreInEval=False, color=(70, 130, 180)),
9: Label(name='terrain', id=22, trainId=9, category='nature', categoryId=4, hasInstances=False, ignoreInEval=False, color=(152, 251, 152)),
8: Label(name='vegetation', id=21, trainId=8, category='nature', categoryId=4, hasInstances=False, ignoreInEval=False, color=(107, 142, 35)),
7: Label(name='traffic sign', id=20, trainId=7, category='object', categoryId=3, hasInstances=False, ignoreInEval=False, color=(220, 220, 0)),
6: Label(name='traffic light', id=19, trainId=6, category='object', categoryId=3, hasInstances=False, ignoreInEval=False, color=(250, 170, 30)),
5: Label(name='pole', id=17, trainId=5, category='object', categoryId=3, hasInstances=False, ignoreInEval=False, color=(153, 153, 153)),
4: Label(name='fence', id=13, trainId=4, category='construction', categoryId=2, hasInstances=False, ignoreInEval=False, color=(190, 153, 153)),
3: Label(name='wall', id=12, trainId=3, category='construction', categoryId=2, hasInstances=False, ignoreInEval=False, color=(102, 102, 156)),
2: Label(name='building', id=11, trainId=2, category='construction', categoryId=2, hasInstances=False, ignoreInEval=False, color=(70, 70, 70)),
1: Label(name='sidewalk', id=8, trainId=1, category='flat', categoryId=1, hasInstances=False, ignoreInEval=False, color=(244, 35, 232)),
0: Label(name='road', id=7, trainId=0, category='flat', categoryId=1, hasInstances=False, ignoreInEval=False, color=(128, 64, 128))}
id2category
{0: 'void',
1: 'flat',
2: 'construction',
3: 'object',
4: 'nature',
5: 'sky',
6: 'human',
7: 'vehicle'}
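To see how the 34 labels distribute over the 8 categories, a quick sketch using the category2labels dictionary built above:
for category, labs in category2labels.items():
    print(f"{category:12s} -> {len(labs)} labels: {[l.name for l in labs]}")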
Data import.
For practicality, we selected a sample from each folder and sorted the files into an X folder (input data, the raw images) and a Y folder (output data, the masks).
path_X = "C:/Users/Engasser Ophélie/Desktop/DR_Project/data_X/X_train_stuttgart/stuttgart_000001_000019_leftImg8bit.png"
path_y = "C:/Users/Engasser Ophélie/Desktop/DR_Project/data_y/y_train_mask_stuttgart/"
image = Image.open(path_X)
image_color = Image.open(path_y + "stuttgart_000001_000019_gtFine_color.png")
print(image.mode, image_color.mode)
print(image.size, image_color.size)
RGB RGBA (2048, 1024) (2048, 1024)
plt.figure(figsize=(10, 10))
plt.subplot(2, 2, 1), plt.imshow(image)
plt.title('Original image')
plt.axis('off')
plt.subplot(2, 2, 2), plt.imshow(image_color)
plt.title('Label colours')
plt.axis('off');
The colour images show all 34 labels; we focus on 8 classes only. It is essential to analyse the data from the different cities, to make sure that differences between cities do not affect training quality.
# histogram of the image for the 3 channels (red, green, blue)
image_np = np.array(image)
color = ('r','g','b')  # PIL returns an RGB array, so channel 0 is red
for i,col in enumerate(color):
    histr = cv.calcHist([image_np],[i],None,[256],[0,256])
    plt.plot(histr,color = col)
    plt.xlim([0,256])
plt.title('Histogram')
plt.xlabel('Pixel value')
plt.ylabel('Pixel count')
plt.show()
The histogram shows the distribution of pixel values for each RGB channel. All three channels follow the same pattern: most pixel values fall roughly in the 25-140 range, and the curve is bimodal (modes around 45 and 120).
labels = Image.open(path_y + "stuttgart_000001_000019_gtFine_labelIds.png")
print(labels.mode)
print(labels.size)
L (2048, 1024)
Size analysis.
The mask has the same dimensions as the original image and is in greyscale.
matrix = np.array(labels)
print(matrix.shape)
print(np.unique(matrix))
(1024, 2048) [ 1 3 4 5 7 8 11 13 17 19 20 21 23 24 26 27]
This is a matrix of labels; we can see which labels are present here.
# display
plt.figure(figsize=(10, 10))
plt.subplot(2, 2, 1), plt.imshow(labels, cmap='gray')
plt.title('Greyscale mask')
plt.axis('off')
plt.subplot(2, 2, 2), plt.imshow(labels)
plt.title('Colour mask')
plt.axis('off');
This part applies the predefined masks to the raw images before using them in a model.
Approach: for resource reasons we did not use the whole dataset. We selected certain cities to build the train, val and test sets, with roughly an 80:20 split between train and val, and about 10% of the data for the test set.
def load_image(image_path, size=(128, 128)):
    # open the image, resize and normalise
    image = Image.open(image_path).resize(size)
    return np.array(image) / 255.0  # normalised RGB image

def load_mask(mask_path, size=(128, 128)):
    # open the mask and resize it; nearest-neighbour resampling keeps label IDs
    # intact (bilinear/bicubic interpolation would invent non-existent labels)
    mask = Image.open(mask_path).resize(size, Image.NEAREST)
    return np.array(mask)  # label IDs (no normalisation)
train_images = []
train_masks = []
val_images = []
val_masks = []
test_images = []
test_masks = []
# directory paths
train_images_dir = "C:/Users/Engasser Ophélie/Desktop/DR_Project/data_X/X_train_stuttgart_tubingen_strasbourg_ulm_bremen_hamburg_zurich"
train_masks_dir = "C:/Users/Engasser Ophélie/Desktop/DR_Project/data_y/y_train_mask_stuttgart_tubingen_strasbourg_ulm_bremen_hamburg_zurich"
val_images_dir = "C:/Users/Engasser Ophélie/Desktop/DR_Project/data_X/X_val_frankfurt"
val_masks_dir = "C:/Users/Engasser Ophélie/Desktop/DR_Project/data_y/y_val_mask_frankfurt"
test_images_dir = "C:/Users/Engasser Ophélie/Desktop/DR_Project/data_X/X_test_jena"
test_masks_dir = "C:/Users/Engasser Ophélie/Desktop/DR_Project/data_y/y_test_mask_jena"
# load the images and masks for the training set
for filename in os.listdir(train_images_dir):
    if filename.endswith('_leftImg8bit.png'):
        image_path = os.path.join(train_images_dir, filename)
        mask_filename = filename.replace('_leftImg8bit.png', '_gtFine_labelIds.png')
        mask_path = os.path.join(train_masks_dir, mask_filename)
        if os.path.exists(mask_path):
            image = load_image(image_path, size=(128, 128))
            mask = load_mask(mask_path, size=(128, 128))
            train_images.append(image)
            train_masks.append(mask)
# load the images and masks for the validation set
for filename in os.listdir(val_images_dir):
    if filename.endswith('_leftImg8bit.png'):
        image_path = os.path.join(val_images_dir, filename)
        mask_filename = filename.replace('_leftImg8bit.png', '_gtFine_labelIds.png')
        mask_path = os.path.join(val_masks_dir, mask_filename)
        if os.path.exists(mask_path):
            image = load_image(image_path, size=(128, 128))
            mask = load_mask(mask_path, size=(128, 128))
            val_images.append(image)
            val_masks.append(mask)
# load the images and masks for the test set
for filename in os.listdir(test_images_dir):
    if filename.endswith('_leftImg8bit.png'):
        image_path = os.path.join(test_images_dir, filename)
        mask_filename = filename.replace('_leftImg8bit.png', '_gtFine_labelIds.png')
        mask_path = os.path.join(test_masks_dir, mask_filename)
        if os.path.exists(mask_path):
            image = load_image(image_path, size=(128, 128))
            mask = load_mask(mask_path, size=(128, 128))
            test_images.append(image)
            test_masks.append(mask)
# convert the lists to numpy arrays
train_images = np.array(train_images)
train_masks = np.array(train_masks)
val_images = np.array(val_images)
val_masks = np.array(val_masks)
test_images = np.array(test_images)
test_masks = np.array(test_masks)
print(train_images.shape, train_masks.shape, val_images.shape, val_masks.shape, test_images.shape, test_masks.shape)
(1486, 128, 128, 3) (1486, 128, 128) (267, 128, 128, 3) (267, 128, 128) (119, 128, 128, 3) (119, 128, 128)
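A quick check of the split actually obtained from these counts (a small sketch):
total = len(train_images) + len(val_images) + len(test_images)
print(f"train: {len(train_images)/total:.0%}  val: {len(val_images)/total:.0%}  test: {len(test_images)/total:.0%}")
# -> train: 79%  val: 14%  test: 6%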
train_masks[0]
array([[ 8, 10, 10, ..., 10, 10, 8],
[15, 21, 21, ..., 21, 21, 16],
[15, 21, 21, ..., 21, 21, 16],
...,
[ 6, 7, 7, ..., 7, 7, 6],
[ 6, 7, 7, ..., 7, 7, 6],
[ 4, 4, 4, ..., 5, 5, 4]], dtype=uint8)
np.unique(train_masks)
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33],
dtype=uint8)
Relabelling. The masks are encoded with the 34 labels; we need to map each label to its category (8 in total).
def remap_labels_to_categories(mask):
    label_id_to_category_id = {
        0: 0,   # 'void'
        1: 0,   # 'void'
        2: 0,   # 'void'
        3: 0,   # 'void'
        4: 0,   # 'void'
        5: 0,   # 'void'
        6: 0,   # 'void'
        7: 1,   # 'flat'
        8: 1,   # 'flat'
        9: 1,   # 'flat'
        10: 1,  # 'flat'
        11: 2,  # 'construction'
        12: 2,  # 'construction'
        13: 2,  # 'construction'
        14: 2,  # 'construction'
        15: 2,  # 'construction'
        16: 2,  # 'construction'
        17: 3,  # 'object'
        18: 3,  # 'object'
        19: 3,  # 'object'
        20: 3,  # 'object'
        21: 4,  # 'nature'
        22: 4,  # 'nature'
        23: 5,  # 'sky'
        24: 6,  # 'human'
        25: 6,  # 'human'
        26: 7,  # 'vehicle'
        27: 7,  # 'vehicle'
        28: 7,  # 'vehicle'
        29: 7,  # 'vehicle'
        30: 7,  # 'vehicle'
        31: 7,  # 'vehicle'
        32: 7,  # 'vehicle'
        33: 7,  # 'vehicle'
        -1: 7   # 'vehicle'
    }
    # work on a copy so the caller's array is not modified in place
    mask = mask.copy()
    # clamp mask values that are not present in the dictionary
    mask[mask < 0] = 0
    mask[mask > 33] = 0
    # remap each mask value to its corresponding category ID
    remapped_mask = np.vectorize(label_id_to_category_id.get)(mask)
    return remapped_mask
# remap the masks in train_masks, val_masks and test_masks
remapped_train_masks = np.array([remap_labels_to_categories(mask) for mask in train_masks])
remapped_val_masks = np.array([remap_labels_to_categories(mask) for mask in val_masks])
remapped_test_masks = np.array([remap_labels_to_categories(mask) for mask in test_masks])
np.unique(remapped_train_masks)
array([0, 1, 2, 3, 4, 5, 6, 7])
print(remapped_train_masks.shape, remapped_val_masks.shape, remapped_test_masks.shape)
(1486, 128, 128) (267, 128, 128) (119, 128, 128)
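np.vectorize loops over pixels in Python and is slow on large batches. An equivalent remap can be done with a NumPy lookup table; a minimal sketch, using the same mapping by remapping every possible uint8 value once:
# build a 256-entry lookup table once, then remap whole arrays by fancy indexing
lut = remap_labels_to_categories(np.arange(256, dtype=np.uint8)).astype(np.uint8)
fast_remapped_train_masks = lut[train_masks]  # one vectorised gather per array
assert np.array_equal(fast_remapped_train_masks, remapped_train_masks)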
The segmentation obtained looks satisfactory, as the visualisations below show.
# train
plt.figure(figsize=(8, 6))
plt.subplot(1, 2, 1)
plt.title('Train Image')
plt.imshow(train_images[0])
plt.axis('off')
plt.subplot(1, 2, 2)
plt.title('Train Mask')
plt.imshow(remapped_train_masks[0])
plt.axis('off')
plt.show()
# val
plt.figure(figsize=(8, 6))
plt.subplot(1, 2, 1)
plt.title('Validation Image')
plt.imshow(val_images[0])
plt.axis('off')
plt.subplot(1, 2, 2)
plt.title('Validation Mask')
plt.imshow(remapped_val_masks[0])
plt.axis('off')
plt.show()
# test
plt.figure(figsize=(8, 6))
plt.subplot(1, 2, 1)
plt.title('Test Image')
plt.imshow(test_images[0])
plt.axis('off')
plt.subplot(1, 2, 2)
plt.title('Test Mask')
plt.imshow(remapped_test_masks[0])
plt.axis('off')
plt.show()
U-Net is an encoder-decoder architecture: the encoder compresses the input (visual, audio, or numeric data such as a vector or tuple) into a smaller equivalent (dimensionality reduction), and the decoder upsamples it back to the input resolution, with skip connections that preserve spatial detail.
# wrapper class so the IoU metric can be used directly in fit()
from tensorflow.keras.metrics import MeanIoU

class UpdatedMeanIoU(MeanIoU):
    def __init__(self,
                 y_true=None,
                 y_pred=None,
                 num_classes=None,
                 name=None,
                 dtype=None):
        super(UpdatedMeanIoU, self).__init__(num_classes=num_classes, name=name, dtype=dtype)

    def update_state(self, y_true, y_pred, sample_weight=None):
        # MeanIoU expects class indices; the model outputs per-class
        # probabilities, hence the argmax over the last axis
        y_pred = tf.math.argmax(y_pred, axis=-1)
        return super().update_state(y_true, y_pred, sample_weight)
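A quick sanity check of the metric on a toy example (hypothetical values, 3 classes instead of 8): a perfect prediction should give a mean IoU of 1.0.
m = UpdatedMeanIoU(num_classes=3, name="mean_iou")
y_true = tf.constant([0, 1, 2])                # true class indices
y_pred = tf.constant([[0.8, 0.1, 0.1],         # argmax -> 0
                      [0.1, 0.8, 0.1],         # argmax -> 1
                      [0.1, 0.1, 0.8]])        # argmax -> 2
m.update_state(y_true, y_pred)
print(float(m.result()))  # 1.0: every class predicted perfectly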
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Conv2DTranspose, concatenate
from tensorflow.keras.models import Model

def unet(input_shape=(128, 128, 3)):
    inputs = Input(input_shape)
    # encoder
    conv1 = Conv2D(64, 3, activation='relu', padding='same')(inputs)
    conv1 = Conv2D(64, 3, activation='relu', padding='same')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(128, 3, activation='relu', padding='same')(pool1)
    conv2 = Conv2D(128, 3, activation='relu', padding='same')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    # bridge
    conv3 = Conv2D(256, 3, activation='relu', padding='same')(pool2)
    conv3 = Conv2D(256, 3, activation='relu', padding='same')(conv3)
    # decoder
    up4 = Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same')(conv3)
    up4 = concatenate([conv2, up4], axis=3)
    conv4 = Conv2D(128, 3, activation='relu', padding='same')(up4)
    conv4 = Conv2D(128, 3, activation='relu', padding='same')(conv4)
    up5 = Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same')(conv4)
    up5 = concatenate([conv1, up5], axis=3)
    conv5 = Conv2D(64, 3, activation='relu', padding='same')(up5)
    conv5 = Conv2D(64, 3, activation='relu', padding='same')(conv5)
    # output layer
    outputs = Conv2D(8, 1, activation='softmax')(conv5)  # 8 output classes
    # build the model
    model = Model(inputs=inputs, outputs=outputs)
    return model
# create a U-Net model instance
model = unet()
# compile the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy', UpdatedMeanIoU(num_classes=8, name="mean_iou")])
More detail on the model is available via the following function:
model.summary()
Model: "functional_1"
Layer (type)                          Output Shape            Param #    Connected to
input_layer (InputLayer)              (None, 128, 128, 3)     0          -
conv2d (Conv2D)                       (None, 128, 128, 64)    1,792      input_layer[0][0]
conv2d_1 (Conv2D)                     (None, 128, 128, 64)    36,928     conv2d[0][0]
max_pooling2d (MaxPooling2D)          (None, 64, 64, 64)      0          conv2d_1[0][0]
conv2d_2 (Conv2D)                     (None, 64, 64, 128)     73,856     max_pooling2d[0][0]
conv2d_3 (Conv2D)                     (None, 64, 64, 128)     147,584    conv2d_2[0][0]
max_pooling2d_1 (MaxPooling2D)        (None, 32, 32, 128)     0          conv2d_3[0][0]
conv2d_4 (Conv2D)                     (None, 32, 32, 256)     295,168    max_pooling2d_1[0][0]
conv2d_5 (Conv2D)                     (None, 32, 32, 256)     590,080    conv2d_4[0][0]
conv2d_transpose (Conv2DTranspose)    (None, 64, 64, 128)     131,200    conv2d_5[0][0]
concatenate (Concatenate)             (None, 64, 64, 256)     0          conv2d_3[0][0], conv2d_transpose[0][0]
conv2d_6 (Conv2D)                     (None, 64, 64, 128)     295,040    concatenate[0][0]
conv2d_7 (Conv2D)                     (None, 64, 64, 128)     147,584    conv2d_6[0][0]
conv2d_transpose_1 (Conv2DTranspose)  (None, 128, 128, 64)    32,832     conv2d_7[0][0]
concatenate_1 (Concatenate)           (None, 128, 128, 128)   0          conv2d_1[0][0], conv2d_transpose_1[0][0]
conv2d_8 (Conv2D)                     (None, 128, 128, 64)    73,792     concatenate_1[0][0]
conv2d_9 (Conv2D)                     (None, 128, 128, 64)    36,928     conv2d_8[0][0]
conv2d_10 (Conv2D)                    (None, 128, 128, 8)     520        conv2d_9[0][0]
Total params: 1,863,304 (7.11 MB)
Trainable params: 1,863,304 (7.11 MB)
Non-trainable params: 0 (0.00 B)
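As a sanity check on these counts: the first Conv2D applies 64 filters of size 3x3 over 3 input channels, i.e. 3*3*3*64 weights + 64 biases = 1,792 parameters, matching the first row of the table.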
# parameters
TRAIN_LENGTH = len(train_images)
BATCH_SIZE = 64
BUFFER_SIZE = 1000  # number of elements kept in the shuffle buffer
STEPS_PER_EPOCH = TRAIN_LENGTH // BATCH_SIZE  # 1486 // 64 = 23, hence the "23/23" steps in the log below
EPOCHS = 100
VAL_SUBSPLITS = 5
VALIDATION_STEPS = len(val_images) // BATCH_SIZE // VAL_SUBSPLITS
# dataset preparation
train_dataset = tf.data.Dataset.from_tensor_slices((train_images, remapped_train_masks))
train_dataset = train_dataset.cache().shuffle(BUFFER_SIZE).batch(BATCH_SIZE).repeat()
val_dataset = tf.data.Dataset.from_tensor_slices((val_images, remapped_val_masks))
val_dataset = val_dataset.batch(BATCH_SIZE)
from tensorflow.keras.callbacks import EarlyStopping
model_history = model.fit(train_dataset, epochs=EPOCHS,
steps_per_epoch=STEPS_PER_EPOCH,
validation_steps=VALIDATION_STEPS,
validation_data=val_dataset,
callbacks=[EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)])
Epoch 1/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 360s 15s/step - accuracy: 0.3856 - loss: 1.6930 - mean_iou: 0.0918 - val_accuracy: 0.4704 - val_loss: 1.4609 - val_mean_iou: 0.1446
Epoch 2/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 294s 13s/step - accuracy: 0.5387 - loss: 1.2937 - mean_iou: 0.2300 - val_accuracy: 0.5644 - val_loss: 1.1870 - val_mean_iou: 0.2606
Epoch 3/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 315s 14s/step - accuracy: 0.6210 - loss: 1.0707 - mean_iou: 0.2971 - val_accuracy: 0.6019 - val_loss: 1.1072 - val_mean_iou: 0.2743
Epoch 4/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 317s 14s/step - accuracy: 0.6479 - loss: 1.0147 - mean_iou: 0.3160 - val_accuracy: 0.6639 - val_loss: 0.9969 - val_mean_iou: 0.3218
Epoch 5/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 316s 14s/step - accuracy: 0.6881 - loss: 0.9231 - mean_iou: 0.3589 - val_accuracy: 0.6957 - val_loss: 0.9077 - val_mean_iou: 0.3740
Epoch 6/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 323s 14s/step - accuracy: 0.7170 - loss: 0.8525 - mean_iou: 0.3977 - val_accuracy: 0.7066 - val_loss: 0.8818 - val_mean_iou: 0.3806
Epoch 7/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 321s 14s/step - accuracy: 0.7333 - loss: 0.8142 - mean_iou: 0.4162 - val_accuracy: 0.7227 - val_loss: 0.8413 - val_mean_iou: 0.3939
Epoch 8/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 345s 15s/step - accuracy: 0.7476 - loss: 0.7824 - mean_iou: 0.4288 - val_accuracy: 0.7390 - val_loss: 0.8009 - val_mean_iou: 0.4150
Epoch 9/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 323s 14s/step - accuracy: 0.7620 - loss: 0.7390 - mean_iou: 0.4440 - val_accuracy: 0.7418 - val_loss: 0.7945 - val_mean_iou: 0.4185
Epoch 10/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 316s 14s/step - accuracy: 0.7704 - loss: 0.7212 - mean_iou: 0.4528 - val_accuracy: 0.7590 - val_loss: 0.7528 - val_mean_iou: 0.4250
Epoch 11/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 322s 14s/step - accuracy: 0.7780 - loss: 0.6923 - mean_iou: 0.4595 - val_accuracy: 0.7726 - val_loss: 0.7129 - val_mean_iou: 0.4528
Epoch 12/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 327s 14s/step - accuracy: 0.7855 - loss: 0.6779 - mean_iou: 0.4701 - val_accuracy: 0.7460 - val_loss: 0.7755 - val_mean_iou: 0.4344
Epoch 13/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 322s 14s/step - accuracy: 0.7920 - loss: 0.6605 - mean_iou: 0.4790 - val_accuracy: 0.7735 - val_loss: 0.7062 - val_mean_iou: 0.4509
Epoch 14/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 315s 14s/step - accuracy: 0.7996 - loss: 0.6386 - mean_iou: 0.4871 - val_accuracy: 0.7763 - val_loss: 0.6898 - val_mean_iou: 0.4545
Epoch 15/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 333s 14s/step - accuracy: 0.8068 - loss: 0.6166 - mean_iou: 0.4945 - val_accuracy: 0.7771 - val_loss: 0.6812 - val_mean_iou: 0.4600
Epoch 16/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 328s 14s/step - accuracy: 0.8061 - loss: 0.6169 - mean_iou: 0.4960 - val_accuracy: 0.7911 - val_loss: 0.6575 - val_mean_iou: 0.4748
Epoch 17/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 323s 14s/step - accuracy: 0.8117 - loss: 0.6030 - mean_iou: 0.4986 - val_accuracy: 0.7872 - val_loss: 0.6643 - val_mean_iou: 0.4799
Epoch 18/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 323s 14s/step - accuracy: 0.8166 - loss: 0.5858 - mean_iou: 0.5093 - val_accuracy: 0.7692 - val_loss: 0.7107 - val_mean_iou: 0.4643
Epoch 19/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 323s 14s/step - accuracy: 0.8098 - loss: 0.6059 - mean_iou: 0.5033 - val_accuracy: 0.7996 - val_loss: 0.6263 - val_mean_iou: 0.4885
Epoch 20/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 319s 14s/step - accuracy: 0.8232 - loss: 0.5696 - mean_iou: 0.5201 - val_accuracy: 0.7953 - val_loss: 0.6327 - val_mean_iou: 0.4905
Epoch 21/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 322s 14s/step - accuracy: 0.8259 - loss: 0.5576 - mean_iou: 0.5268 - val_accuracy: 0.7981 - val_loss: 0.6284 - val_mean_iou: 0.5000
Epoch 22/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 310s 13s/step - accuracy: 0.8262 - loss: 0.5581 - mean_iou: 0.5317 - val_accuracy: 0.8004 - val_loss: 0.6196 - val_mean_iou: 0.4945
Epoch 23/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 317s 14s/step - accuracy: 0.8299 - loss: 0.5463 - mean_iou: 0.5384 - val_accuracy: 0.7986 - val_loss: 0.6302 - val_mean_iou: 0.5032
Epoch 24/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 313s 14s/step - accuracy: 0.8346 - loss: 0.5335 - mean_iou: 0.5460 - val_accuracy: 0.8114 - val_loss: 0.5935 - val_mean_iou: 0.5274
Epoch 25/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 324s 14s/step - accuracy: 0.8390 - loss: 0.5178 - mean_iou: 0.5571 - val_accuracy: 0.8022 - val_loss: 0.6207 - val_mean_iou: 0.5172
Epoch 26/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 321s 14s/step - accuracy: 0.8362 - loss: 0.5274 - mean_iou: 0.5531 - val_accuracy: 0.8108 - val_loss: 0.6027 - val_mean_iou: 0.5255
Epoch 27/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 336s 14s/step - accuracy: 0.8417 - loss: 0.5094 - mean_iou: 0.5675 - val_accuracy: 0.8110 - val_loss: 0.5941 - val_mean_iou: 0.5272
Epoch 28/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 309s 13s/step - accuracy: 0.8411 - loss: 0.5115 - mean_iou: 0.5635 - val_accuracy: 0.8156 - val_loss: 0.5801 - val_mean_iou: 0.5370
Epoch 29/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 317s 14s/step - accuracy: 0.8527 - loss: 0.4742 - mean_iou: 0.5813 - val_accuracy: 0.8072 - val_loss: 0.6009 - val_mean_iou: 0.5310
Epoch 30/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 317s 14s/step - accuracy: 0.8438 - loss: 0.5027 - mean_iou: 0.5741 - val_accuracy: 0.8101 - val_loss: 0.5978 - val_mean_iou: 0.5297
Epoch 31/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 330s 14s/step - accuracy: 0.8476 - loss: 0.4895 - mean_iou: 0.5779 - val_accuracy: 0.8174 - val_loss: 0.5719 - val_mean_iou: 0.5397
Epoch 32/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 311s 13s/step - accuracy: 0.8494 - loss: 0.4845 - mean_iou: 0.5793 - val_accuracy: 0.8139 - val_loss: 0.5813 - val_mean_iou: 0.5312
Epoch 33/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 313s 13s/step - accuracy: 0.8538 - loss: 0.4716 - mean_iou: 0.5859 - val_accuracy: 0.8146 - val_loss: 0.5894 - val_mean_iou: 0.5378
Epoch 34/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 317s 14s/step - accuracy: 0.8555 - loss: 0.4642 - mean_iou: 0.5937 - val_accuracy: 0.8236 - val_loss: 0.5579 - val_mean_iou: 0.5431
Epoch 35/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 324s 14s/step - accuracy: 0.8576 - loss: 0.4578 - mean_iou: 0.5973 - val_accuracy: 0.8209 - val_loss: 0.5679 - val_mean_iou: 0.5489
Epoch 36/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 316s 14s/step - accuracy: 0.8604 - loss: 0.4465 - mean_iou: 0.6014 - val_accuracy: 0.8252 - val_loss: 0.5539 - val_mean_iou: 0.5490
Epoch 37/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 325s 14s/step - accuracy: 0.8570 - loss: 0.4579 - mean_iou: 0.5977 - val_accuracy: 0.8289 - val_loss: 0.5443 - val_mean_iou: 0.5544
Epoch 38/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 316s 14s/step - accuracy: 0.8600 - loss: 0.4496 - mean_iou: 0.6037 - val_accuracy: 0.8257 - val_loss: 0.5526 - val_mean_iou: 0.5535
Epoch 39/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 315s 14s/step - accuracy: 0.8610 - loss: 0.4449 - mean_iou: 0.6036 - val_accuracy: 0.8304 - val_loss: 0.5367 - val_mean_iou: 0.5637
Epoch 40/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 306s 13s/step - accuracy: 0.8638 - loss: 0.4352 - mean_iou: 0.6118 - val_accuracy: 0.8286 - val_loss: 0.5471 - val_mean_iou: 0.5599
Epoch 41/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 311s 13s/step - accuracy: 0.8638 - loss: 0.4325 - mean_iou: 0.6101 - val_accuracy: 0.8297 - val_loss: 0.5420 - val_mean_iou: 0.5635
Epoch 42/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 316s 14s/step - accuracy: 0.8705 - loss: 0.4131 - mean_iou: 0.6222 - val_accuracy: 0.8317 - val_loss: 0.5357 - val_mean_iou: 0.5686
Epoch 43/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 317s 14s/step - accuracy: 0.8664 - loss: 0.4249 - mean_iou: 0.6174 - val_accuracy: 0.8276 - val_loss: 0.5400 - val_mean_iou: 0.5607
Epoch 44/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 325s 14s/step - accuracy: 0.8640 - loss: 0.4295 - mean_iou: 0.6139 - val_accuracy: 0.8294 - val_loss: 0.5452 - val_mean_iou: 0.5608
Epoch 45/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 314s 14s/step - accuracy: 0.8686 - loss: 0.4162 - mean_iou: 0.6197 - val_accuracy: 0.8296 - val_loss: 0.5451 - val_mean_iou: 0.5645
Epoch 46/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 328s 14s/step - accuracy: 0.8682 - loss: 0.4179 - mean_iou: 0.6213 - val_accuracy: 0.8230 - val_loss: 0.5722 - val_mean_iou: 0.5525
Epoch 47/100  23/23 ━━━━━━━━━━━━━━━━━━━━ 312s 14s/step - accuracy: 0.8647 - loss: 0.4274 - mean_iou: 0.6180 - val_accuracy: 0.8248 - val_loss: 0.5608 - val_mean_iou: 0.5563
plt.figure(figsize=(15,6))
plt.subplot(1,3,1)
plt.plot(model_history.history['val_loss'])
plt.plot(model_history.history['loss'])
plt.title("Fitting history: LOSS")
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Validation', 'Train'], loc = 'upper right')
plt.subplot(1,3,2)
plt.plot(model_history.history['val_accuracy'])
plt.plot(model_history.history['accuracy'])
plt.title("Fitting history: ACCURACY")
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Validation', 'Train'], loc = 'upper left')
plt.subplot(1,3,3)
plt.plot(model_history.history['val_mean_iou'])
plt.plot(model_history.history['mean_iou'])
plt.title("Fitting history: MEAN IOU")
plt.ylabel('Mean IoU')
plt.xlabel('Epoch')
plt.legend(['Validation', 'Train'], loc = 'upper left')
plt.show()
The curves show steady, stable learning, though with some overfitting: the training metrics keep improving while the validation metrics plateau, and early stopping ends training at epoch 47 (no improvement in val_loss since epoch 42), restoring the best weights.
from tensorflow.keras.models import load_model
# save
model.save('model.h5')
# load (if loading fails because of the custom metric, pass
# custom_objects={'UpdatedMeanIoU': UpdatedMeanIoU} or compile=False)
model = load_model('model.h5')
# predictions
pred_masks = model.predict(test_images)
# convert the predictions to class masks (each pixel takes the class with the highest probability)
pred_masks = np.argmax(pred_masks, axis=-1)
4/4 ━━━━━━━━━━━━━━━━━━━━ 7s 2s/step
np.unique(pred_masks)
array([0, 1, 2, 3, 4, 5, 6, 7], dtype=int64)
All 8 classes are represented.
# visualisation
num_examples = 3
for i in range(num_examples):
    plt.figure(figsize=(15, 5))
    plt.subplot(1, 3, 1)
    plt.imshow(test_images[i])
    plt.title('Input Image')
    plt.axis('off')
    plt.subplot(1, 3, 2)
    plt.imshow(remapped_test_masks[i])
    plt.title('True Mask')
    plt.axis('off')
    plt.subplot(1, 3, 3)
    plt.imshow(pred_masks[i])
    plt.title('Predicted Mask')
    plt.axis('off')
    plt.show()
We focus on the IoU (Intersection over Union) metric: the intersection is the set of pixels common to the prediction and the ground truth, and the union is the set of pixels belonging to either of the two. $$IoU = \frac{\text{intersection area}}{\text{union area}}$$ It is a relevant metric for semantic segmentation because it measures how many pixels are correctly classified, in other words the spatial accuracy of the predicted masks.
def iou_metric(y_true, y_pred, num_classes):
    ious = []
    for cls in range(num_classes):
        intersection = np.logical_and(y_true == cls, y_pred == cls)
        union = np.logical_or(y_true == cls, y_pred == cls)
        if np.sum(union) == 0:
            iou_score = float('nan')  # avoid division by zero when the class is absent
        else:
            iou_score = np.sum(intersection) / np.sum(union)
        ious.append(iou_score)
    return ious
# IoU for each class
num_classes = 8
ious = []  # per-class IoUs for each pair of masks
for true_mask, pred_mask in zip(remapped_test_masks, pred_masks):
    ious.append(iou_metric(true_mask, pred_mask, num_classes))
# mean IoU (nanmean so a class absent from a mask pair does not poison the average)
mean_iou_per_class = np.nanmean(ious, axis=0)
mean_iou = np.nanmean(mean_iou_per_class)
mean_iou.round(2)
0.52
mean_iou_per_class.round(2)
array([0.66, 0.85, 0.69, 0.11, 0.61, 0.58, 0.16, 0.5 ])
# display
class_names = ['void', 'flat', 'construction', 'object', 'nature', 'sky', 'human', 'vehicle']
fig, ax = plt.subplots(figsize=(10, 6))
bars = ax.bar(class_names, mean_iou_per_class, color='lightblue')
ax.set_title('Mean IoU per class')
ax.set_xlabel('Class')
ax.set_ylabel('Mean IoU')
ax.set_ylim(0, 1)
ax.grid(axis='y', linestyle='--', color='lightgrey')
for bar in bars:
    height = bar.get_height()
    ax.annotate(f'{height:.2f}',
                xy=(bar.get_x() + bar.get_width() / 2, height),
                xytext=(0, 3),
                textcoords="offset points",
                ha='center', va='bottom')
plt.tight_layout()
plt.show()
The classes with the fewest correctly classified pixels are objects and humans. Looking at the predicted masks, we indeed see pedestrians being merged into vehicles, a particularly problematic result if the model is to be applied to autonomous driving. This is explained by the fact that objects and pedestrians form the smallest polygons in the images, hence the ones offering the fewest pixels for the model to learn from. Moreover, while pedestrians are fairly easy for the human eye to pick out because we are aware of their shape, in terms of pixel contrast they can be hard for a model to separate and extract.
Conversely, the best-predicted classes cover large regions of the photos: the model sees far more pixels for them and is therefore better at detecting them.
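To support this imbalance argument quantitatively, the pixel share of each class can be counted directly in the ground-truth masks. A minimal sketch using the arrays and class_names defined above:
# pixel share of each class in the ground-truth test masks (illustrative check of class imbalance)
labels, counts = np.unique(remapped_test_masks, return_counts=True)
for cls, n in zip(labels, counts):
    print(f"{class_names[cls]:>12}: {n / counts.sum():.2%} of pixels")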
!pip install scikit-learn
Collecting scikit-learn
Downloading scikit_learn-1.5.0-cp311-cp311-win_amd64.whl (11.0 MB)
Successfully installed joblib-1.4.2 scikit-learn-1.5.0 threadpoolctl-3.5.0
from sklearn.metrics import confusion_matrix
# flatten the masks to get a list of pixel labels
y_true = remapped_test_masks.flatten()
y_pred = pred_masks.flatten()
conf_matrix = confusion_matrix(y_true, y_pred)
plt.figure(figsize=(10, 8))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues')
plt.xlabel('Predicted Labels')
plt.ylabel('True Labels')
plt.title('Confusion Matrix')
plt.show()
from sklearn.metrics import classification_report
class_report = classification_report(y_true, y_pred)
print("Classification Report:\n")
print(class_report)
Classification Report:
precision recall f1-score support
0 0.92 0.54 0.68 197494
1 0.89 0.95 0.92 739743
2 0.82 0.86 0.84 511198
3 0.45 0.13 0.20 53572
4 0.79 0.89 0.84 273454
5 0.85 0.86 0.86 46727
6 0.52 0.37 0.43 24347
7 0.74 0.85 0.79 103161
accuracy 0.84 1949696
macro avg 0.75 0.68 0.70 1949696
weighted avg 0.84 0.84 0.83 1949696
# values copied from the classification report above
precision = [.92, .89, .82, .45, .79, .85, .52, .74]
recall = [.54, .95, .86, .13, .89, .86, .37, .85]
f1_score = [.68, .92, .84, .20, .84, .86, .43, .79]
bar_width = 0.25
r1 = np.arange(len(precision))
r2 = [x + bar_width for x in r1]
r3 = [x + bar_width for x in r2]
plt.figure(figsize=(12, 6))
plt.bar(r1, precision, color='lightblue', width=bar_width, edgecolor='grey', label='Precision')
plt.bar(r2, recall, color='steelblue', width=bar_width, edgecolor='grey', label='Recall')
plt.bar(r3, f1_score, color='darkblue', width=bar_width, edgecolor='grey', label='F1 Score')
plt.xlabel('Class', fontweight='bold')
plt.ylabel('Score', fontweight='bold')
plt.title('Performance Metrics per Class')
plt.xticks([r + bar_width for r in range(len(precision))], class_names)
for i in range(len(precision)):
    plt.text(r1[i], precision[i] + 0.02, f'{precision[i]:.2f}', ha='center', va='bottom')
    plt.text(r2[i], recall[i] + 0.02, f'{recall[i]:.2f}', ha='center', va='bottom')
    plt.text(r3[i], f1_score[i] + 0.02, f'{f1_score[i]:.2f}', ha='center', va='bottom')
plt.ylim(0, 1)
plt.grid(axis='y', linestyle='--', color='lightgrey')
plt.legend()
plt.tight_layout()
plt.show()
Precision, recall and f1-score show the same tendencies as the IoU, namely weaker performance on the object and human classes. The metrics are fairly consistent within each class, except for 'void', which combines good precision with low recall (the same holds for 'object'). For objects, for instance, this means that when the model predicts an object it is right 45% of the time (precision), but it misses most of them (13% recall). As mentioned above, this is explained by the under-representation of these classes in the dataset.
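To make this reading concrete, per-class precision and recall can be recovered directly from the confusion matrix computed above (rows are true labels, columns are predicted labels). A minimal sketch for the 'object' class (id 3):
# precision and recall for one class, derived from the confusion matrix
cls = 3  # 'object'
tp = conf_matrix[cls, cls]
precision_cls = tp / conf_matrix[:, cls].sum()  # correct among pixels predicted as 'object'
recall_cls = tp / conf_matrix[cls, :].sum()     # detected among true 'object' pixels
print(f"object: precision={precision_cls:.2f}, recall={recall_cls:.2f}")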
We will use the optuna library to optimize the hyperparameters automatically, then use optuna's study object to retrain the model with the best hyperparameters. We chose to vary the number of filters in each convolutional layer as well as the learning rate.
!pip install optuna
!pip install optuna-integration
Collecting optuna
Collecting optuna-integration
Successfully installed Mako-1.3.5 alembic-1.13.1 colorlog-6.8.2 greenlet-3.0.3 optuna-3.6.1 sqlalchemy-2.0.30
Successfully installed optuna-integration-3.6.0
import optuna
from optuna.integration import TFKerasPruningCallback
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Conv2DTranspose, concatenate
from tensorflow.keras.models import Model
def unet_model(trial):
    inputs = Input(shape=(128, 128, 3))
    # hyperparameters to tune -- MODIFY HERE to add or remove hyperparameters; optuna will search for the best values
    n_filters = trial.suggest_categorical('n_filters', [32, 64])
    learning_rate = trial.suggest_float('learning_rate', 1e-5, 1e-3, log=True)
    # encoder
    conv1 = Conv2D(n_filters, 3, activation='relu', padding='same')(inputs)
    conv1 = Conv2D(n_filters, 3, activation='relu', padding='same')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(n_filters * 2, 3, activation='relu', padding='same')(pool1)
    conv2 = Conv2D(n_filters * 2, 3, activation='relu', padding='same')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    # bridge
    conv3 = Conv2D(n_filters * 4, 3, activation='relu', padding='same')(pool2)
    conv3 = Conv2D(n_filters * 4, 3, activation='relu', padding='same')(conv3)
    # decoder
    up4 = Conv2DTranspose(n_filters * 2, (2, 2), strides=(2, 2), padding='same')(conv3)
    up4 = concatenate([conv2, up4], axis=3)
    conv4 = Conv2D(n_filters * 2, 3, activation='relu', padding='same')(up4)
    conv4 = Conv2D(n_filters * 2, 3, activation='relu', padding='same')(conv4)
    up5 = Conv2DTranspose(n_filters, (2, 2), strides=(2, 2), padding='same')(conv4)
    up5 = concatenate([conv1, up5], axis=3)
    conv5 = Conv2D(n_filters, 3, activation='relu', padding='same')(up5)
    conv5 = Conv2D(n_filters, 3, activation='relu', padding='same')(conv5)
    # output layer
    outputs = Conv2D(8, 1, activation='softmax')(conv5)
    model = Model(inputs=[inputs], outputs=[outputs])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate),
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                  metrics=['accuracy', UpdatedMeanIoU(num_classes=8)])
    return model
def objective(trial):
    model = unet_model(trial)
    TRAIN_LENGTH = len(train_images)
    BATCH_SIZE = 64
    BUFFER_SIZE = 1000
    STEPS_PER_EPOCH = TRAIN_LENGTH // BATCH_SIZE
    EPOCHS = 100
    VAL_SUBSPLITS = 5
    VALIDATION_STEPS = len(val_images) // BATCH_SIZE // VAL_SUBSPLITS
    # dataset preparation
    train_dataset = tf.data.Dataset.from_tensor_slices((train_images, remapped_train_masks))
    train_dataset = train_dataset.cache().shuffle(BUFFER_SIZE).batch(BATCH_SIZE).repeat()
    val_dataset = tf.data.Dataset.from_tensor_slices((val_images, remapped_val_masks))
    val_dataset = val_dataset.batch(BATCH_SIZE)
    history = model.fit(train_dataset,
                        epochs=EPOCHS,
                        steps_per_epoch=STEPS_PER_EPOCH,
                        validation_data=val_dataset,
                        validation_steps=VALIDATION_STEPS,
                        callbacks=[TFKerasPruningCallback(trial, 'val_loss')],
                        verbose=0)
    return min(history.history['val_loss'])
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=1)  # MODIFY HERE: increase n_trials to let optuna explore more hyperparameter combinations
[I 2024-06-05 09:13:46,221] A new study created in memory with name: no-name-976d0bc8-da94-4151-8588-4a8dc8d08c64
[I 2024-06-05 19:59:41,910] Trial 0 finished with value: 0.5540214776992798 and parameters: {'n_filters': 64, 'learning_rate': 0.00027647990450630345}. Best is trial 0 with value: 0.5540214776992798.
We can now train a new model with the best hyperparameters.
from tensorflow.keras.callbacks import EarlyStopping
best_params = study.best_params
best_params
def unet_model(best_params):
    inputs = Input(shape=(128, 128, 3))
    # best hyperparameters found by optuna
    n_filters = best_params['n_filters']
    # encoder
    conv1 = Conv2D(n_filters, 3, activation='relu', padding='same')(inputs)
    conv1 = Conv2D(n_filters, 3, activation='relu', padding='same')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(n_filters * 2, 3, activation='relu', padding='same')(pool1)
    conv2 = Conv2D(n_filters * 2, 3, activation='relu', padding='same')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    # bridge
    conv3 = Conv2D(n_filters * 4, 3, activation='relu', padding='same')(pool2)
    conv3 = Conv2D(n_filters * 4, 3, activation='relu', padding='same')(conv3)
    # decoder
    up4 = Conv2DTranspose(n_filters * 2, (2, 2), strides=(2, 2), padding='same')(conv3)
    up4 = concatenate([conv2, up4], axis=3)
    conv4 = Conv2D(n_filters * 2, 3, activation='relu', padding='same')(up4)
    conv4 = Conv2D(n_filters * 2, 3, activation='relu', padding='same')(conv4)
    up5 = Conv2DTranspose(n_filters, (2, 2), strides=(2, 2), padding='same')(conv4)
    up5 = concatenate([conv1, up5], axis=3)
    conv5 = Conv2D(n_filters, 3, activation='relu', padding='same')(up5)
    conv5 = Conv2D(n_filters, 3, activation='relu', padding='same')(conv5)
    # output layer
    outputs = Conv2D(8, 1, activation='softmax')(conv5)
    model = Model(inputs=[inputs], outputs=[outputs])
    return model
model = unet_model(best_params)
TRAIN_LENGTH = len(train_images)
BATCH_SIZE = 64
BUFFER_SIZE = 1000
STEPS_PER_EPOCH = TRAIN_LENGTH // BATCH_SIZE
EPOCHS = 100
model.compile(optimizer=tf.keras.optimizers.Adam(best_params['learning_rate']),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy', UpdatedMeanIoU(num_classes=8, name="mean_iou")])
model_history = model.fit(train_dataset, epochs=EPOCHS,
                          steps_per_epoch=STEPS_PER_EPOCH,
                          validation_steps=VALIDATION_STEPS,
                          validation_data=val_dataset,
                          callbacks=[EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)])
Epoch 1/100 23/23 ━━━━━━━━━━━━━━━━━━━━ 362s 14s/step - accuracy: 0.3388 - loss: 1.9724 - mean_iou: 0.0575 - val_accuracy: 0.3924 - val_loss: 1.7324 - val_mean_iou: 0.0491
...
Epoch 51/100 23/23 ━━━━━━━━━━━━━━━━━━━━ 345s 15s/step - accuracy: 0.8355 - loss: 0.5280 - mean_iou: 0.5486 - val_accuracy: 0.8095 - val_loss: 0.5902 - val_mean_iou: 0.5195
...
Epoch 56/100 23/23 ━━━━━━━━━━━━━━━━━━━━ 340s 15s/step - accuracy: 0.8426 - loss: 0.5113 - mean_iou: 0.5591 - val_accuracy: 0.7999 - val_loss: 0.6169 - val_mean_iou: 0.5123
plt.figure(figsize=(15,6))
plt.subplot(1,3,1)
plt.plot(model_history.history['val_loss'])
plt.plot(model_history.history['loss'])
plt.title("Fitting history: LOSS")
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Validation', 'Train'], loc = 'upper right')
plt.subplot(1,3,2)
plt.plot(model_history.history['val_accuracy'])
plt.plot(model_history.history['accuracy'])
plt.title("Fitting history: ACCURACY")
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Validation', 'Train'], loc = 'upper left')
plt.subplot(1,3,3)
plt.plot(model_history.history['val_mean_iou'])
plt.plot(model_history.history['mean_iou'])
plt.title("Fitting history: MEAN IOU")
plt.ylabel('Mean IoU')
plt.xlabel('Epoch')
plt.legend(['Validation', 'Train'], loc = 'upper left')
plt.show()
# save
model.save('model_optim.h5')
# reload
model_optim = load_model('model_optim.h5')
# predictions
pred_masks_optim = model_optim.predict(test_images)
# convert probabilities to class masks (each pixel takes the class with the highest probability)
pred_masks_optim = np.argmax(pred_masks_optim, axis=-1)
4/4 ━━━━━━━━━━━━━━━━━━━━ 9s 2s/step
np.unique(pred_masks_optim)
array([0, 1, 2, 3, 4, 5, 6, 7], dtype=int64)
# visualization
num_examples = 3
for i in range(num_examples):
    plt.figure(figsize=(15, 5))
    plt.subplot(1, 3, 1)
    plt.imshow(test_images[i])
    plt.title('Input Image')
    plt.axis('off')
    plt.subplot(1, 3, 2)
    plt.imshow(remapped_test_masks[i])
    plt.title('True Mask')
    plt.axis('off')
    plt.subplot(1, 3, 3)
    plt.imshow(pred_masks_optim[i])
    plt.title('Predicted Mask')
    plt.axis('off')
    plt.show()
# IoU for each class
num_classes = 8
ious = []  # per-class IoUs for each pair of masks
for true_mask, pred_mask in zip(remapped_test_masks, pred_masks_optim):
    ious.append(iou_metric(true_mask, pred_mask, num_classes))  # compare each pair, not the full prediction array
# mean IoU
mean_iou_per_class = np.mean(ious, axis=0)
mean_iou = np.mean(mean_iou_per_class)
0.24
mean_iou_per_class.round(2)
array([0.56, 0.69, 0.32, 0.01, 0.16, 0.09, 0.01, 0.08])
plt.figure(figsize=(10, 6))
bars = plt.bar(class_names, mean_iou_per_class, color='lightblue', edgecolor='grey')
for bar, value in zip(bars, mean_iou_per_class):
    plt.text(bar.get_x() + bar.get_width() / 2, bar.get_height() + 0.01, f'{value:.2f}', ha='center', va='bottom')
plt.title('Mean IoU per Class')
plt.xlabel('Class')
plt.ylabel('Mean IoU')
plt.ylim(0, 1)  # limit the y-axis to [0, 1]
plt.grid(axis='y', linestyle='--', color='lightgrey')
plt.tight_layout()
plt.show()
# flatten the masks to get a list of pixel labels
y_true = remapped_test_masks.flatten()
y_pred = pred_masks_optim.flatten()
conf_matrix = confusion_matrix(y_true, y_pred)
plt.figure(figsize=(10, 8))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues')
plt.xlabel('Predicted Labels')
plt.ylabel('True Labels')
plt.title('Confusion Matrix')
plt.show()
class_report = classification_report(y_true, y_pred)
print("Classification Report:\n")
print(class_report)
Classification Report:
precision recall f1-score support
0 0.93 0.49 0.64 197494
1 0.86 0.96 0.91 739743
2 0.81 0.82 0.82 511198
3 0.56 0.04 0.08 53572
4 0.74 0.89 0.81 273454
5 0.83 0.85 0.84 46727
6 0.58 0.12 0.19 24347
7 0.69 0.82 0.75 103161
accuracy 0.82 1949696
macro avg 0.75 0.62 0.63 1949696
weighted avg 0.82 0.82 0.80 1949696
To go further, the model can be optimized using several methods: Hyperopt, Optuna...
In our case, the Optuna optimization did not improve the model's performance; the results are even slightly below those of the baseline model. Note that, for lack of computing resources, we only varied the n_filters and learning_rate parameters (and even then, the pipeline ran for 8 hours). One avenue for improvement would be to extend the pipeline with more hyperparameters, tested over wider ranges.
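As an illustration, the search space in unet_model(trial) could be widened along these lines. A sketch only: dropout_rate and batch_size are hypothetical additions, not wired into the pipeline above:
# hypothetical widening of the optuna search space (sketch, not run here)
def suggest_hyperparameters(trial):
    return {
        'n_filters': trial.suggest_categorical('n_filters', [16, 32, 64, 128]),
        'learning_rate': trial.suggest_float('learning_rate', 1e-6, 1e-2, log=True),
        'dropout_rate': trial.suggest_float('dropout_rate', 0.0, 0.5),        # hypothetical
        'batch_size': trial.suggest_categorical('batch_size', [16, 32, 64]),  # hypothetical
    }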
Moreover, we observed during this work that progressively increasing the number of cities in the training data considerably improved the model and made it more stable. Another avenue for improvement would be to further increase the amount of input data, or even, were it possible, to use the entire dataset.
A final remark concerns the limits of an image semantic segmentation model for an autonomous driving use case. We used 8 potential object categories here, while the dataset contains 34; in reality, a visual scene in road driving involves a far larger number of classes.
To go further, we built a Flask application that lets a user upload an image or a video and apply our model in real time (see the dedicated files).
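For reference, the core of such an application reduces to a single inference endpoint. This is a minimal sketch rather than the actual application in the dedicated files; the route name, the [0, 1] normalization, and the JSON response format are assumptions:
from flask import Flask, request, jsonify
import numpy as np
import tensorflow as tf

app = Flask(__name__)
# compile=False loads weights only; the custom IoU metric is not needed at inference time
model = tf.keras.models.load_model('model.h5', compile=False)

@app.route('/predict', methods=['POST'])
def predict():
    # decode the uploaded image and resize to the model's 128x128 input
    img = tf.io.decode_image(request.files['image'].read(), channels=3)
    img = tf.image.resize(img, (128, 128)) / 255.0  # assumes [0, 1] normalization at training time
    # predict and convert probabilities to a class mask
    probs = model.predict(img[np.newaxis, ...])
    mask = np.argmax(probs, axis=-1)[0]
    return jsonify({'mask': mask.tolist()})

if __name__ == '__main__':
    app.run(debug=True)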
THANK YOU, MADAM
import os
from IPython.display import display, Image
# define the folder containing images
image_folder = 'images'
# get a list of all image files in the folder
image_files = [f for f in os.listdir(image_folder) if f.endswith(('.png', '.jpg', '.jpeg', '.gif', '.bmp', '.tiff'))]
# display each image
for image_file in image_files:
    image_path = os.path.join(image_folder, image_file)
    display(Image(filename=image_path))